Robust Classification with Interval Data
Authors
Abstract
We consider a binary, linear classification problem in which the data points are assumed to be unknown, but bounded within given hyper-rectangles, i.e., the covariates are bounded within intervals explicitly given for each data point separately. We address the problem of designing a robust classifier in this setting by minimizing the worst-case value of a given loss function, over all possible choices of the data in these multi-dimensional intervals. We examine in detail the application of this methodology to three specific loss functions, arising in support vector machines, in logistic regression and in minimax probability machines. We show that in each case, the resulting problem is amenable to efficient interior-point algorithms for convex optimization. The methods tend to produce sparse classifiers, i.e., they induce many zero coefficients in the resulting weight vectors, and we provide some theoretical grounds for this property. After presenting possible extensions of this framework to handle label errors and other uncertainty models, we discuss in some detail our implementation, which exploits the potential sparsity, or a more general property referred to as regularity, of the input matrices.
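The worst-case idea in the abstract can be illustrated for the hinge loss used in support vector machines. The sketch below is not the authors' implementation: it assumes each hyper-rectangle is given by a center and a vector of nonnegative half-widths (the names `centers`, `radii`, `worst_case_hinge` and `brute_force_hinge` are hypothetical). Under that assumption, the smallest value of y(w·x + b) over the box is y(w·c + b) minus the radius-weighted sum of |w_j|, so the worst-case hinge loss has a closed form. The extra term is a weighted l1 norm of the weights, which is one plausible way to read the sparsity mentioned above.

```python
import itertools
import numpy as np

def worst_case_hinge(w, b, centers, radii, labels):
    """Worst-case hinge loss per point when x_i may lie anywhere in the box
    [centers[i] - radii[i], centers[i] + radii[i]] (assumed parameterization)."""
    margins = labels * (centers @ w + b)   # margin at the box center
    penalty = radii @ np.abs(w)            # largest margin reduction inside the box
    return np.maximum(0.0, 1.0 - margins + penalty)

def brute_force_hinge(w, b, center, radius, label):
    """Check for one point: the hinge loss is convex in x, so its maximum
    over a box is attained at one of the corners."""
    worst = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(w)):
        x = center + np.array(signs) * radius
        worst = max(worst, max(0.0, 1.0 - label * (x @ w + b)))
    return worst

# Illustrative data only (not taken from the paper).
w = np.array([1.0, -2.0, 0.0])
b = 0.5
centers = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, -1.0]])
radii = np.array([[0.5, 0.5, 5.0], [0.5, 0.25, 1.0]])
labels = np.array([1.0, -1.0])

losses = worst_case_hinge(w, b, centers, radii, labels)
checks = [brute_force_hinge(w, b, c, r, y)
          for c, r, y in zip(centers, radii, labels)]
print(losses, checks)
```

Note that the third feature has a large interval but a zero weight, so it contributes nothing to the penalty; a wide interval only hurts when the classifier actually uses that feature.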
Related articles
Robust Optimization and Confidence Interval DEA for Efficiency Evaluation with Intervals Case Study: Evaluating CRM Units in a Call Center in Tehran
Proposing a Robust Model of Interval Data Envelopment Analysis to Performance Measurement under Double Uncertainty Situations
It is essential to account for uncertainty in the data, and to decide how to handle it, when measuring performance with data envelopment analysis, because a small deviation in the data can lead to a significant change in the performance results. However, in the real world and in many cases, the data is uncertain. Interval data envelopment analysis is one of the most widely used approaches to d...
A Bootstrap Interval Robust Data Envelopment Analysis for Estimate Efficiency and Ranking Hospitals
Data envelopment analysis (DEA) is a non-parametric method for evaluating the efficiency of each unit. Limited resources in the healthcare economy are the main reason for measuring the efficiency of hospitals. In this study, a bootstrap interval data envelopment analysis (BIRDEA) is proposed for measuring the efficiency of hospitals affiliated with the Hamedan University of Medical Sciences. The propos...
Interval network data envelopment analysis model for classification of investment companies in the presence of uncertain data
The main purpose of this paper is to propose an approach for performance measurement, classification and ranking of investment companies (ICs) that considers internal structure and uncertainty. To reach this goal, the interval network data envelopment analysis (INDEA) models are extended. This model can represent two-stage efficiency with intermediate measures i...
Intelligent and Robust Genetic Algorithm Based Classifier
The concepts of robust classification and intelligently controlling the search process of genetic algorithm (GA) are introduced and integrated with a conventional genetic classifier for development of a new version of it, which is called Intelligent and Robust GA-classifier (IRGA-classifier). It can efficiently approximate the decision hyperplanes in the feature space. It is shown experime...
Multi-Group Classification Using Interval Linear Programming
Among the various statistical and data mining discriminant analysis methods proposed so far for group classification, linear programming discriminant analysis has recently attracted researchers' interest. This study evaluates multi-group discriminant linear programming (MDLP) for classification problems against well-known methods such as neural networks and support vector machines. MDLP is less compli...